Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

moves shred deduper inside the thread-pool #4179

Merged
merged 2 commits into from
Jan 3, 2025

Conversation

behzadnouri
Copy link

Problem

Historically shred deduping was done outside the thread-pool because the deduper required a mutable refrences and/or locking. Current deduper uses atomic operations and can be used inside the thread-pool without creating contention.

Summary of Changes

  • The commit moves shred deduper inside the thread-pool.
  • Also removing some of the intermediate allocations.

@behzadnouri behzadnouri force-pushed the retransmit-par-dedup branch 2 times, most recently from 94f54e5 to 6b5c516 Compare December 19, 2024 16:52
Copy link

@gregcusack gregcusack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks mostly good. just one question/comment. Sorry for the delay on this review.

root_bank: &Bank,
cluster_nodes: &ClusterNodes<RetransmitStage>,
shred_deduper: &ShredDeduper<2>,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is 2 here (or K in `ShredDeduper)? and would it make sense to make it a constant with a descriptive name?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the number of hashers used in inner Deduper.
I added a default value, so that the call-sites do not have to repeat that. see last commit here: 47c84f7

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good but should it be a const default such as:

const DEDUP_HASHERS: usize = 2;

and then:

struct ShredDeduper<const K: usize = DEDUP_HASHERS> {
    deduper: Deduper<K, /*shred:*/ [u8]>,
    shred_id_filter: Deduper<K, (ShredId, /*0..MAX_DUPLICATE_COUNT:*/ usize)>,
}

just not a huge fan of standalone numbers in code if we can avoid it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I slightly prefer to keep it inline because it is more evident what the default is equal to (you don't need to jump elsewhere in the code to find that out).

Either way, ShredDeduper<K> definition is independent of this change. So I think it would be better to revise this in a separate PR if needed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya i get the not wanting to jump around in code. just prefer the naming clarity over the cost to jump. but ya can be done in separate PR if needed in future.

Historically shred deduping was done outside the thread-pool because the
deduper required a mutable reference and/or locking.
Current deduper uses atomic operations and can be used inside the
thread-pool without creating contention.
Copy link

@gregcusack gregcusack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@behzadnouri behzadnouri merged commit eda57c9 into anza-xyz:master Jan 3, 2025
40 checks passed
@behzadnouri behzadnouri deleted the retransmit-par-dedup branch January 3, 2025 22:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants